refactor(schema): split lead-scoring schema into schemes/lead_scoring/ [LTV-Pg.2]#112
Merged
Merged
Conversation
…/ [LTV-Pg.2] Completes the M2 physical reorg (LTV-Pg). Pulls the lead-scoring-specific schema definitions out of the shared `schema/` package, leaving only genuine cross-scheme primitives behind. New files (schemes/lead_scoring/): - entities.py — ContactRow … SubscriptionRow + ALL_ROW_TYPES / TABLE_NAMES - relationships.py — ALL_CONSTRAINTS - features.py — LEAD_SNAPSHOT_FEATURES + redacted_columns_for - tasks.py — CONVERTED_WITHIN_90_DAYS + task_manifest_for_config Shared schema/ after the split (only primitives remain): - entities.py — EntityRowProtocol, make_empty_dataframe, AccountRow - features.py — FeatureSpec - relationships.py — FKConstraint, FKViolationError, validate_fk - tasks.py — SplitSpec, TaskManifest All callers updated (36 files): multi-line from-import blocks rewritten via perl; 4 mixed imports (tests/exposure/test_redaction.py, tests/schema/ test_relationships.py, tests/render/test_render.py, tests/narrative/ test_dataset_card.py) fixed manually where FeatureSpec/validate_fk/TaskManifest stay shared but co-imported with moved symbols. tests/schemes/test_module_layout.py: 3 new tests for Pg.2 — primitives-stay, scheme-specifics-in-scheme, removed-from-shared-schema. CHANGELOG, CLAUDE.md (canonical layout), roadmap (Pg.2 ✓), agent-plan updated. Verified byte-identical to pre-reorg main (14/14 files); full suite 1537 passed / 51 skipped; ruff + mypy clean (96 source files). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
This PR completes the LTV-M2 / LTV-Pg.2 physical reorg by moving lead-scoring-specific schema definitions out of leadforge/schema/ into leadforge/schemes/lead_scoring/, leaving only genuinely cross-scheme primitives in schema/. It also updates imports across the codebase and adds tests that lock in the intended module boundaries.
Changes:
- Introduces
leadforge/schemes/lead_scoring/{entities,features,relationships,tasks}.pyand migrates lead-scoring schema catalogs/constants into them. - Refactors
leadforge/schema/{entities,features,relationships,tasks}.pyto retain only shared primitives and updates call sites accordingly. - Adds/updates tests and documentation to enforce and describe the new module layout.
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/validation/test_realism.py | Updates imports to pull lead-scoring feature catalog from the scheme package. |
| tests/test_primary_task_threading.py | Updates task imports to use lead-scoring scheme tasks module. |
| tests/simulation/test_engine.py | Updates entity imports to scheme-scoped lead-scoring rows. |
| tests/scripts/test_build_v7_snapshot.py | Updates entity imports (TouchRow) to scheme module. |
| tests/schemes/test_module_layout.py | Adds layout-lock assertions for shared-vs-scheme boundaries. |
| tests/schemes/lifecycle/test_entities.py | Updates imports for lead-scoring catalogs used in lifecycle tests. |
| tests/schema/test_tasks.py | Updates imports for moved task manifest symbol(s). |
| tests/schema/test_relationships.py | Imports lead-scoring ALL_CONSTRAINTS from scheme module while keeping shared primitives in schema. |
| tests/schema/test_features.py | Updates imports for moved lead-scoring feature catalog and redaction helper. |
| tests/schema/test_entities.py | Updates imports for moved lead-scoring entity rows/catalogs. |
| tests/render/test_render.py | Updates imports to scheme-scoped entities/features/constraints. |
| tests/narrative/test_dataset_card.py | Splits shared task primitive imports vs lead-scoring task factory import. |
| tests/exposure/test_redaction.py | Updates imports to scheme-scoped lead-scoring feature catalog/redaction helper. |
| leadforge/validation/release_quality.py | Updates imports to scheme-scoped lead-scoring feature catalog. |
| leadforge/validation/realism.py | Updates imports and doc references to scheme-scoped lead-scoring feature catalog/entities. |
| leadforge/validation/invariants.py | Updates imports to scheme-scoped lead-scoring redaction helper. |
| leadforge/validation/bundle_checks.py | Updates imports to scheme-scoped lead-scoring features + constraints. |
| leadforge/schemes/lifecycle/entities.py | Updates doc references to lead-scoring entity rows under scheme module. |
| leadforge/schemes/lead_scoring/tasks.py | New: lead-scoring task manifest constant + config-to-manifest factory. |
| leadforge/schemes/lead_scoring/simulation/state.py | Updates doc reference to scheme-scoped LeadRow. |
| leadforge/schemes/lead_scoring/simulation/population.py | Updates entity imports to scheme-scoped lead-scoring rows. |
| leadforge/schemes/lead_scoring/simulation/engine.py | Updates imports/doc references to scheme-scoped entity rows. |
| leadforge/schemes/lead_scoring/render/tasks.py | Updates task imports to scheme-scoped lead-scoring tasks module. |
| leadforge/schemes/lead_scoring/render/snapshots.py | Updates entity/features imports to scheme-scoped modules. |
| leadforge/schemes/lead_scoring/render/relational.py | Updates entity imports to scheme-scoped lead-scoring rows. |
| leadforge/schemes/lead_scoring/relationships.py | New: lead-scoring FK constraint catalog. |
| leadforge/schemes/lead_scoring/features.py | New: lead-scoring feature catalog + redaction helper. |
| leadforge/schemes/lead_scoring/entities.py | New: lead-scoring entity rows + catalogs. |
| leadforge/schemes/lead_scoring/init.py | Updates bundle writer imports to scheme-scoped schema modules. |
| leadforge/schema/tasks.py | Removes lead-scoring task definitions; keeps shared primitives only. |
| leadforge/schema/relationships.py | Removes lead-scoring constraint catalog; keeps shared primitives only. |
| leadforge/schema/features.py | Removes lead-scoring feature catalog/redaction helper; keeps FeatureSpec only. |
| leadforge/schema/entities.py | Removes lead-scoring entity rows/catalogs; keeps shared primitives + AccountRow. |
| leadforge/schema/dictionaries.py | Updates feature catalog import source for dictionary generation. |
| leadforge/narrative/dataset_card.py | Updates imports for feature catalog used in dataset card rendering. |
| docs/ltv/roadmap.md | Marks LTV-Pg.2 as complete and links PR. |
| CLAUDE.md | Updates repository layout documentation to reflect the new shared-schema boundary. |
| .agent-plan.md | Updates planning notes to reflect Pg.2 opened/merged state. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+10
to
+14
| @@ -11,7 +11,7 @@ | |||
|
|
|||
| import pandas as pd | |||
|
|
|||
| from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec | |||
| from leadforge.schemes.lead_scoring.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec | |||
| @@ -4,7 +4,7 @@ | |||
|
|
|||
| import pytest | |||
|
|
|||
| from leadforge.schema.tasks import CONVERTED_WITHIN_90_DAYS, SplitSpec | |||
| from leadforge.schemes.lead_scoring.tasks import CONVERTED_WITHIN_90_DAYS, SplitSpec | |||
Comment on lines
8
to
11
|
|
||
| from leadforge.schema.dictionaries import feature_dictionary_df, write_feature_dictionary | ||
| from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec | ||
| from leadforge.schemes.lead_scoring.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec | ||
|
|
Comment on lines
+8
to
12
| from leadforge.schema.tables import read_parquet, write_parquet | ||
| from leadforge.schemes.lead_scoring.entities import ( | ||
| ALL_ROW_TYPES, | ||
| TABLE_NAMES, | ||
| AccountRow, |
Comment on lines
14
to
17
| from leadforge.schema.tables import read_parquet, write_parquet | ||
| from leadforge.schemes.lead_scoring.entities import ALL_ROW_TYPES, TABLE_NAMES, AccountRow | ||
| from leadforge.schemes.lead_scoring.relationships import ALL_CONSTRAINTS, FKConstraint | ||
| from leadforge.schemes.lifecycle.entities import ( |
| from typing import TYPE_CHECKING | ||
|
|
||
| from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec | ||
| from leadforge.schemes.lead_scoring.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec |
Three concrete fixes after hostile self-review of #112: 1. dictionaries.py imported FeatureSpec from the lead-scoring scheme module — a shared-envelope module pulling a shared primitive via a scheme-specific path (same smell as the _empty_df→make_empty_dataframe catch in Pg.1). Fixed: FeatureSpec now imported from schema.features (its actual home); only LEAD_SNAPSHOT_FEATURES comes from schemes.lead_scoring.features. 2. redacted_columns_for signature silently changed from `features: tuple[FeatureSpec, ...] = LEAD_SNAPSHOT_FEATURES` to `features: tuple[FeatureSpec, ...] | None = None` (behavioral change in a refactor — None was previously a TypeError, now silently substitutes the default). Restored the original honest signature; moved the function to after LEAD_SNAPSHOT_FEATURES in the file to avoid the forward-reference problem that caused the change in the first place. 3. Lead-scoring schema tests moved from tests/schema/ to tests/schemes/lead_scoring/ for consistency with the lifecycle precedent (test_lifecycle_entities.py → tests/schemes/lifecycle/test_entities.py in Pg.1). Having tests/schema/ test_entities.py import from schemes/lead_scoring/entities was jarring. tests/schema/ now only contains test_ids.py (tests the shared IDs module). 4. Stale docstring in exposure/filters.py updated: referenced leadforge.schema.features.redacted_columns_for which moved. Verified byte-identical (14/14); full suite 1537 passed / 51 skipped; ruff + mypy clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
pr-agent-context report: This run includes unresolved review comments on PR #112 in repository https://github.com/leadforge-dev/leadforge
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: leadforge/schema/dictionaries.py
URL: https://github.com/leadforge-dev/leadforge/pull/112#discussion_r3393474594
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The module docstring still references `leadforge.schema.features.LEAD_SNAPSHOT_FEATURES`, but that symbol now lives under `leadforge.schemes.lead_scoring.features`. Updating this keeps the documentation accurate after the schema split.
## COPILOT-2
Location: tests/schemes/lead_scoring/test_tasks.py:7
URL: https://github.com/leadforge-dev/leadforge/pull/112#discussion_r3393474639
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
This test file is described as testing `leadforge.schema.tasks`, but it imports the shared `SplitSpec` primitive via the lead-scoring scheme module. Import `SplitSpec` from `leadforge.schema.tasks` to keep the primitive’s canonical location and avoid an unnecessary dependency on the lead-scoring scheme.
## COPILOT-3
Location: tests/schemes/lead_scoring/test_features.py:11
URL: https://github.com/leadforge-dev/leadforge/pull/112#discussion_r3393474665
Root author: copilot-pull-request-reviewer
Comment:
`FeatureSpec` is a shared primitive that lives in `leadforge.schema.features`. Importing it from `leadforge.schemes.lead_scoring.features` makes the test depend on a scheme-specific module and obscures the intended boundary.
## COPILOT-4
Location: tests/schemes/lead_scoring/test_entities.py:12
URL: https://github.com/leadforge-dev/leadforge/pull/112#discussion_r3393474684
Root author: copilot-pull-request-reviewer
Comment:
`AccountRow` is explicitly a shared entity-row primitive kept in `leadforge.schema.entities`. Importing it from `schemes.lead_scoring.entities` adds an unnecessary dependency on the lead-scoring scheme and makes it easier to accidentally treat shared primitives as scheme-specific.
## COPILOT-5
Location: tests/schemes/lifecycle/test_entities.py:17
URL: https://github.com/leadforge-dev/leadforge/pull/112#discussion_r3393474693
Root author: copilot-pull-request-reviewer
Comment:
`AccountRow` and `FKConstraint` are shared primitives that remain in `leadforge.schema.*`. Importing them from lead-scoring scheme modules couples the lifecycle tests to the lead-scoring scheme and undermines the intended split between shared primitives and scheme-specific catalogs.
## COPILOT-6
Location: leadforge/narrative/dataset_card.py:15
URL: https://github.com/leadforge-dev/leadforge/pull/112#discussion_r3393474722
Root author: copilot-pull-request-reviewer
Comment:
`FeatureSpec` is a shared primitive defined in `leadforge.schema.features`. Importing it from `leadforge.schemes.lead_scoring.features` makes this scheme-agnostic narrative renderer depend on a scheme module unnecessarily.Run metadata: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Completes the M2 physical reorg (
LTV-Pg.2, milestoneLTV-M2). Pullsthe lead-scoring-specific schema definitions out of the shared
schema/packageinto
schemes/lead_scoring/, leaving only genuine cross-scheme primitives behind.What moved
New files in
schemes/lead_scoring/:entities.pyContactRow…SubscriptionRow+ALL_ROW_TYPES/TABLE_NAMESrelationships.pyALL_CONSTRAINTSfeatures.pyLEAD_SNAPSHOT_FEATURES+redacted_columns_fortasks.pyCONVERTED_WITHIN_90_DAYS+task_manifest_for_configWhat stayed in shared
schema/entities.pyEntityRowProtocol,make_empty_dataframe,AccountRowfeatures.pyFeatureSpecrelationships.pyFKConstraint,FKViolationError,validate_fktasks.pySplitSpec,TaskManifestWhy this boundary
AccountRowis genuinely shared: used byLIFECYCLE_ROW_TYPES(lifecycle scheme).FeatureSpecis a shared primitive: lifecycle will defineCUSTOMER_SNAPSHOT_FEATURESwith it.SplitSpec/TaskManifestare abstract descriptors: lifecycle tasks will use them.Callers
36 files updated. 4 had mixed imports (FeatureSpec/validate_fk/TaskManifest stay shared but were co-imported with moved symbols) and needed manual fixing.
Layout-lock tests
3 new assertions in
tests/schemes/test_module_layout.py:schema.*schemes.lead_scoring.*schema.*namespaceVerification
ruff+mypyclean (96 files).BUNDLE_SCHEMA_VERSIONunchanged.M2 complete
With this PR,
LTV-M2(generation-scheme architecture + physical reorg) is fullydone. The package now has peer schemes with their own schema, simulation, render,
and task specifications — the foundation for
LTV-M3(lifecycle customerpopulation).
🤖 Generated with Claude Code